STAT*CONCEPT

An Interactive Computer Package Supporting a First Course 
in Educational Statistics 

Abstract 

The Statistical Concepts Package (STAT*CONCEPT) is a
coordinated collection of interactive computer programs and printed
materials designed to enrich graduate instruction in educational
statistics. Probability, descriptive statistics, sampling
distributions, and power are four of the thirteen topics included.
By doing the computational labor and providing a responsive
environment, the computer frees the student to explore statistical
concepts in a "what happens if" laboratory-type, investigative
atmosphere. This paper discusses the rationale, content, and
student-program interaction of the STAT*CONCEPT package. Technical
aspects as well as the package's usage during the past two years
are also discussed.




1. INTRODUCTION

The Statistical Concepts Package (STAT*CONCEPT) is a coordinated
package of interactive computer programs and printed materials. The
development of STAT*CONCEPT grew out of a perceived need to
modernize laboratory aspects and enrich instruction in a first
course in graduate educational statistics. It was developed and
implemented by the Laboratory of Experimental Design, Department of
Educational Psychology at the University of Wisconsin-Madison. Mr.
George Behr developed the computer programs and Ms. Victoria Petro
Rubner developed the initial coordinated lesson plans and user
guides. A grant from the Knapp Bequest allowed Mr. Frank Baker
to prepare a concise user manual which incorporates both the
computer program rationale and lesson plans in a thirteen-session
format. The purpose of the manual and computer programs is to
promote the use of computers in teaching statistics on a
university-wide basis. Implementation of STAT*CONCEPT during the
past two years has shown it to be a useful tool in augmenting lecture
material.
2. CONTENT



STAT*CONCEPT is designed to supplement instruction and not to
teach per se. It is not a spectator activity and requires that
students become involved in interacting with the computer. At
present the package consists of thirteen laboratory sessions which
deal with major statistical constructs and relationships. The
sessions follow a sequence similar to that found in elementary
statistics textbooks. A brief description of each session appears
below.



Session 1: Tables

The primary function of this session is to acquaint the student
with the computer terminal procedures and formats employed by
STAT*CONCEPT. The statistical concept of interest is the
relationship between the width of the grouping interval employed
in constructing frequency distributions and the information
conveyed by the distribution.

Session 2: Probability

This session provides the student an opportunity to apply
elementary rules of probability to a data set provided in the
lesson. Such concepts as mutually exclusive, independent, and
joint events are examined.

Session 3: Descriptive Statistics

This session assists the student in developing a correspondence
between actual data and the descriptive statistics which depict
certain attributes of that data. The student is able to get a
"feel" for the information conveyed by the mean, median, range,
variance and standard deviation.

Session 4: Random Variables

In this session the student performs coin flipping experiments
quickly and efficiently. This facility provides the opportunity
to grapple with the concepts of a fair coin, random sequencing,
expected value and discrete binomial random variables. By varying
the number of coins tossed and the number of trials performed,
the student observes the effect on the frequency distribution of
the number of successes.

Session 5: Binomial Distribution

The relationship between a particular binomial distribution,
defined by its parameters n and p, and the resulting probability
distribution is the focus of this session. The student varies the
values of n and p, observing the effect on the shape of the
binomial density.

Session 6: Standard Scores

This session deals with standard scores. The student specifies
values for the mean and standard deviation of a normal
distribution and practices converting data values to standard
scores and, conversely, standard scores to data values.



Session 7: Normal Distribution

This session focuses on the development of student skill in using
tables of the normal distribution. The student specifies values
for the mean and standard deviation of a hypothetical
distribution. The computer then generates positive or negative
z-values to which the student must respond with the appropriate
probability value. Similarly, the computer generates a cumulative
probability value from which the student must derive the correct
z-value.

Session 8: Sampling Distributions of the Mean

The relationships between the distribution of a random variable
in a population, the distribution of a random variable in a
sample and the sampling distribution of a statistic are examined
in this session. The program is designed to draw samples from a
uniform, normal, binomial, χ² or F population distribution. The
student specifies the distribution type and the size and number
of random samples to be drawn. The computer then draws each
sample, computing its mean and generating an empirical sampling
distribution of the mean.

Session 9: Sampling Distribution of the Variance

In this session, the same procedures as those employed in
Session 8 are utilized to generate empirical sampling
distributions of both the biased and unbiased estimates of the
variance. For each empirical sampling distribution, the computer
calculates the mean, variance and standard deviation.
Session 10: Hypothesis Testing

The procedures followed in setting up a hypothesis testing
situation and the interdependencies of the factors involved are
examined. The program assumes a large sample z-test of the
hypothesis H0: μ = 100. The student specifies the significance
level, rejection region, population standard deviation, sample
size and population mean under the alternative. The student
varies the value of one of these quantities, holding the other
four constant, and analyzes the effect on the critical value(s).

Session 11: Power Curves

The computer performs the calculations needed for the student to
construct power curves for a large sample z-test of the
hypothesis H0: μ = 100. As in Session 10, the student varies the
value of one of the five factors, holding the others constant.
For each variation, the value of power is calculated by the
computer; the student plots the derived values of power and
analyzes the effects the manipulations have on the power curve.

Session 12: The t-test

The focus of this session is significance testing using the
t-distribution. The computer has been programmed to perform the
calculations associated with the single sample t-test, two
independent sample t-test and the correlated (paired) t-test.
The student is able to explore statistical inference procedures.

Session 13: Analysis of Variance

Concepts involved in ANOVA are examined by the student. The
one-way fixed effects ANOVA is used to illustrate sources of
variation, breakdown of sums of squares, fixed effects, and the
F tests employed.

A proposed expansion of STAT*CONCEPT includes adding sessions
to deal with such topics as regression and correlation, concepts
usually encountered in a second educational statistics course.
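
The empirical sampling distributions described for Sessions 8 and 9
can be sketched in a few lines of present-day code. The following
Python fragment is an illustration only, not the package's FORTRAN V
source: the function names, the uniform (0, 1) population, the sample
size of 5, the 2000 samples and the fixed seed are all assumptions
made for the example.

```python
import random
import statistics


def empirical_sampling_distributions(draw, sample_size, num_samples, seed=0):
    """Draw repeated samples and collect the mean and both variance estimates.

    `draw` takes a random.Random instance and returns one observation.
    `sample_size` must be at least 2 so that both estimates are defined.
    """
    rng = random.Random(seed)
    means, biased, unbiased = [], [], []
    for _ in range(num_samples):
        sample = [draw(rng) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
        biased.append(statistics.pvariance(sample))    # divisor n
        unbiased.append(statistics.variance(sample))   # divisor n - 1
    return means, biased, unbiased


def summarize(values):
    """Mean, variance and standard deviation of an empirical distribution."""
    return (statistics.fmean(values),
            statistics.variance(values),
            statistics.stdev(values))


if __name__ == "__main__":
    # Assumed population: uniform on (0, 1), with mean 1/2 and variance 1/12.
    means, biased, unbiased = empirical_sampling_distributions(
        lambda rng: rng.random(), sample_size=5, num_samples=2000)
    for label, values in (("mean", means),
                          ("biased variance", biased),
                          ("unbiased variance", unbiased)):
        m, v, s = summarize(values)
        print(f"{label:18s} mean={m:.4f} var={v:.5f} sd={s:.4f}")
```

With divisor n, the mean of the biased estimates should fall below
the population variance of 1/12 by a factor of about (n - 1)/n, while
the mean of the unbiased estimates should lie close to 1/12.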



3. STUDENT-PROGRAM INTERACTION



Instructions for each session include a worksheet explanation
of how to use the computer via the terminal, a process flow chart
which provides an overall diagram for the session, and a set of
coordinated exercises designed to provide direction for the
exploration of each statistical concept. (See STAT*CONCEPT,
Session 5 in the Appendix of this paper.) Students are instructed
to study the lesson in detail before interacting with the computer.
This allows them to become familiar with console procedures, the
sequence of actions involved, the source of the data to be dealt
with and how the exercises direct exploration of the concepts. Many
sessions employ a prestored set of data that can be used to illustrate
the computer procedures as well as provide information needed for
the exercises. Some sessions allow the students to provide the
data to be analyzed. Various sessions contain extended capabilities
that allow interested students to go beyond the requirements of the
exercises and explore the concepts in greater depth.

The student generally deals with each session in a "what
happens if ..." framework; certain quantities are manipulated over
a range of values and the effect of the manipulation is observed
and analyzed with respect to a particular variable. For example,
in Session 11 power curves are constructed for a large sample
z-test of the hypothesis H0: μ = 100. The student specifies
whether the test is two-tailed or one-tailed (and the direction)
and fixes values for α, the population standard deviation and sample
size. The alternative mean is then systematically varied and the
student examines the value of power for each alternative. Similarly,
other factors such as α, the population standard deviation or the
sample size can be varied and the effect on the power curve observed.
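
The power calculation behind such a curve can be sketched as follows.
This is a minimal Python illustration of the standard large-sample
z-test power formula, not the package's own routine; the function
name `z_test_power` and the default values for the population
standard deviation (15), sample size (25) and α (.05) are assumptions
chosen only for the example.

```python
from statistics import NormalDist


def z_test_power(mu_alt, mu_null=100.0, sigma=15.0, n=25, alpha=0.05,
                 tail="two"):
    """Power of a large-sample z-test of H0: mu = mu_null.

    tail is "two", "upper" (H1: mu > mu_null) or "lower" (H1: mu < mu_null).
    """
    std_normal = NormalDist()
    se = sigma / n ** 0.5             # standard error of the sample mean
    shift = (mu_alt - mu_null) / se   # mean of the test statistic under H1
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        # Probability of landing in either rejection region under H1.
        return (std_normal.cdf(-z_crit - shift)
                + 1 - std_normal.cdf(z_crit - shift))
    z_crit = std_normal.inv_cdf(1 - alpha)
    if tail == "upper":
        return 1 - std_normal.cdf(z_crit - shift)
    if tail == "lower":
        return std_normal.cdf(-z_crit - shift)
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


if __name__ == "__main__":
    # Vary the alternative mean while holding the other factors fixed.
    for mu_alt in range(100, 112, 2):
        print(f"mu = {mu_alt:3d}   power = {z_test_power(mu_alt):.4f}")
```

When the alternative mean equals the hypothesized mean, the computed
power reduces to α, which gives a quick check on the calculation.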



4. TECHNICAL ASPECTS



STAT*CONCEPT software is at present written in FORTRAN V and
operates on the UNIVAC 1110 at the Madison Academic Computing Center
(MACC) of the University of Wisconsin-Madison. Each session's
computer program is an independent, self-contained package and does
not require secondary storage for operation. The 1110 is a word
oriented machine in which one word equals 36 bits and 6 bits equal
one character.
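
The word layout just described implies six characters per word. The
Python sketch below illustrates that arithmetic by packing 6-bit
codes into a 36-bit integer. It is an assumption-laden illustration:
the function names are invented, the numeric codes are arbitrary
values rather than entries from the actual UNIVAC character table,
and padding short strings with zero codes on the right is a choice
made only for the example.

```python
BITS_PER_CHAR = 6
BITS_PER_WORD = 36
CHARS_PER_WORD = BITS_PER_WORD // BITS_PER_CHAR  # six characters per word


def pack_word(codes):
    """Pack up to six 6-bit character codes into one 36-bit integer word."""
    if len(codes) > CHARS_PER_WORD:
        raise ValueError("a 36-bit word holds at most six 6-bit characters")
    word = 0
    for code in codes:
        if not 0 <= code < 2 ** BITS_PER_CHAR:
            raise ValueError("each code must fit in 6 bits")
        word = (word << BITS_PER_CHAR) | code
    # Left-justify short strings by filling the remaining positions with zeros.
    word <<= BITS_PER_CHAR * (CHARS_PER_WORD - len(codes))
    return word


def unpack_word(word):
    """Split a 36-bit word back into its six 6-bit character codes."""
    mask = 2 ** BITS_PER_CHAR - 1
    return [(word >> shift) & mask
            for shift in range(BITS_PER_WORD - BITS_PER_CHAR, -1, -BITS_PER_CHAR)]
```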

The programs are designed to operate with either a standard
teletypewriter or Hazeltine 2000 alphanumeric display as the
input-output medium. Particular advantage has been taken of the
Hazeltine's screen control capabilities; e.g., cursor movements,
print position, program control to roll up or down selected portions
of the display and variation of screen intensity are used to provide
effective formatting and information presentation on the alphanumeric
display. Graphical displays appear with some of the computer programs.
An extension of the display mode could incorporate more sophisticated
graphical displays in a larger number of sessions. Availability
of a graphics terminal would make such an extension feasible.

The programs are basically exportable despite naturally existing
hardware and operating system idiosyncrasies. The programs are
written in FORTRAN V, a UNIVAC extended FORTRAN, except the
Hazeltine control routine, which is written in assembly language.
The following brief statements describe the nature of the dependencies:


1. Special FORTRAN V capabilities such as ENCODE, DECODE,
REREAD, etc. are employed by the programs. However,
their functional counterparts do exist or are
programmable in other FORTRANs.

2. Format specifications are machine dependent. UNIVAC
uses FIELDATA code, a 6-bit character code. Parenthetically,
Hazeltine uses standard 8-bit ASCII code for control
characters. Since such a code is not available to
FORTRAN users on the UNIVAC, an assembly language code
is required. It is quite possible that the Hazeltine
command routine could be written in FORTRAN on another
system.

3. MACC developed and supported utility routines, e.g.,
centering character strings, providing date and time
of day, are employed. Statistical routines to generate
random numbers from specified distributions as well as
to calculate cumulative areas under such distributions
are utilized. Most academic computing centers should
contain libraries of equivalent software functions.

Although conversion efforts would be necessary to use STAT*CONCEPT
on another computer, the following considerations will aid such
endeavors:

1. The source document includes numerous explanatory
comments.

2. Each program is built in a modular fashion, with a set of
subroutines common to many programs.

3. The STATISTICAL CONCEPTS PACKAGE manual includes
functional flowcharts for the computer programs which
support the thirteen sessions. Sample output is also
included, providing a criterion for debugging.

Due to differing hardware and operating systems it is difficult
to extrapolate program operating characteristics and costs to other
machines. However, a few observations concerning our operational
experience may provide some idea of the nature of these issues.
Our experience indicates that a student spends about one-half hour
per session interacting with the computer. Operationally this
amounts to five seconds of CPU time and run-time costs of about
$2 under MACC's billing scheme. Storage in terms of space and cost
is minimal since the thirteen programs are relatively small. Programs
are independent and need not be simultaneously on-line. To further
minimize storage requirements, it may be desirable to have those
programs on-line which coincide with lecture material and remove
them after a suitable time period.
5. CONCLUSION

Students enjoy using the coordinated package and find it a
beneficial means of exploring basic statistical concepts. STAT*CONCEPT
is significant because of its utility in modernizing the laboratory
facet of a first graduate course in statistics. It can be incorporated
for use in several ways. The computer supported package can
constitute the entire laboratory aspect for the course or the
materials can be used to supplement existing laboratory experiences.
Students can individually use various sessions to review and
explore concepts which are difficult for them to understand. Our
experiences with STAT*CONCEPT have shown that nonquantitatively-
oriented students find this application particularly useful. In
addition to individual use, STAT*CONCEPT sessions have been utilized
effectively in group situations. Sessions can be simultaneously
coordinated with class presentation to dramatize and visually
depict the concepts under consideration. The instructor can perform
the mechanics at the console and direct class analysis of the output.

Useful criticisms resulting from completion of evaluation
forms by students have been incorporated into the current revision
of the original system. Feedback from users continues to be
solicited as students employ STAT*CONCEPT individually and in groups
to aid their understanding of statistical principles and
relationships.



APPENDIX 



Session 5 
BINOMIAL DISTRIBUTION

Introduction 

The binomial distribution is one of several reference distributions
that are used in statistics. The binomial distribution is actually
a family of distributions whose individual members are defined by
the values of the parameters N and P. In the present laboratory
session the relationship between the probability distribution and
its parameters is examined. For a given set of parameter values
the computer will evaluate the binomial function over all possible
outcomes. The resulting probability distribution will then be
displayed at the computer terminal for your inspection. By
manipulating the values of the parameters, you should develop an
appreciation for the family of binomial distributions and the role of
parameters, and gain an understanding of the relationship of the
shape of the binomial density to the values of the parameters.
The computer program evaluates the function rule

\[
P(X = r) = \binom{N}{r} P^{r} (1 - P)^{N - r}, \qquad 0 \le r \le N
\]

where: N is the number of observations

P is the probability of a success

r is the number of successes in N observations

X is a random variable

N and P are the parameters of the binomial distribution.
Before using the computer terminal, read the complete session 
description and determine what you are to do. 



Computer Terminal Procedures

Follow the instructions in the computer procedures section to
turn on the terminal and log in with the computer. The XQT statement
will be

@XQT STAT*CONCEPT.BDIST

Step 1 

COIN TOSS OR BINOMIAL? Respond by typing BVAL.

Step 2

N = ? will be printed. You must provide a value of N, the number of
independent observations, where 1.0 ≤ N ≤ 9.0. Be sure to include
the decimal point.

Step 3

P = ? will be printed. Type in a value of P, the probability of
success. The value of P must be a two digit decimal number whose
value is 0.00 ≤ P ≤ 0.99.

The computer will then evaluate the binomial function over all
possible values of r for the parameters N and P you have specified.
The resulting probability distribution will be printed in the form
shown below.

For example, if N = 5.0 and P = .38, the corresponding table
would read:



 R     P(X = R)

 0     .091613
 1     .280750
 2     .344146
 3     .210928
 4     .064639
 5     .007924



As expected, the sum of the probabilities is 1.000000. 
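
The table above can be reproduced with a few lines of present-day
code. The Python sketch below evaluates the binomial function rule
directly; it is an illustration rather than the package's FORTRAN V
program, and the function name `binomial_table` is an assumption.

```python
from math import comb


def binomial_table(n, p):
    """Return [(r, P(X = r))] for r = 0..n under a binomial(n, p) model."""
    return [(r, comb(n, r) * p ** r * (1 - p) ** (n - r))
            for r in range(n + 1)]


if __name__ == "__main__":
    table = binomial_table(5, 0.38)
    print(" R     P(X = R)")
    for r, prob in table:
        print(f"{r:2d}     {prob:.6f}")
    print(f"sum    {sum(prob for _, prob in table):.6f}")
```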



Step 4

MORE BINOMIAL EVALUATIONS? will be printed. A YES response will
return you to Step 2 and the whole process will be repeated. A NO
response will result in termination and the end of lesson message
will appear.



BINOMIAL DISTRIBUTION
Process Flow

[Flowchart: COIN TOSS OR BINOMIAL? (respond BVAL) -> N = ? (1.0 ≤ N ≤ 9.0)
-> P = ? (0.00 ≤ P ≤ 0.99) -> Display Table of R and P(X = R)]



Exercises 



1. Examine the effect of varying P for fixed N.
Let N = 6. be fixed. Set P = .2 initially.

a) Sketch the resulting binomial distribution on the figure below.



[Figure: blank grid for sketching the distributions. Vertical axis:
P(X = R), from 0 to 1.0 in steps of .1. Horizontal axis:
X = Number of Successes.]

b) Use the computer to obtain tables for N = 6, P = .5 and
N = 6, P = .8. In each case sketch the resulting binomial
distribution on the figure above. You should observe three
major relationships:

1) As P increases, the largest probability value occurs near NP.

2) When P = .5, the distribution is symmetrical with its
maximum at X = 3.

3) When P < .5 the "tail" of the distribution is to the
right. When P > .5 the tail of the distribution is to
the left.



2. Examine the effect of varying N for a fixed value of P.
Let N = 3 initially. Let P = .3 be fixed.

a) Sketch the resulting binomial distribution.

[Figure: blank grid for sketching the distributions. Vertical axis:
P(X = R), from 0 to 1.0 in steps of .1. Horizontal axis:
X = Number of Successes, from 0 to 13.]

b) The mean of the binomial distribution is NP, which in the
present case is (3)(.3) = .9; thus the maximum probability
should occur near X = 1.0.

c) Perform a series of binomial evaluations with N = 6., 9. and
12. for P = .3.

1) For each distribution, calculate NP and note whether the
maximum probability occurs near this value.

2) Sketch each distribution on the diagram above.

d) You should observe the following:

1) The location of the distribution is a function of NP;
hence as sample size increases for fixed P, the
distribution is further to the right of the diagram.

2) The distribution tends to smooth out as N
is increased, while its shape is generally the same
for the values of N involved.

3. Now that you have observed the effect of N and P on the binomial
distribution, explore this relationship in greater detail.

a) Select a value of N and P; then sketch what you think the
corresponding binomial distribution should look like.

b) Enter the value of N and P and compare the obtained
distribution with your sketch.

c) Repeat a, b until you get a good correspondence between the
two distributions.
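
The relationships these exercises ask the student to observe can also
be checked numerically. The Python sketch below is an illustration
only: the function names are assumptions, and the tail direction is
judged from the sign of the binomial third central moment,
NP(1 - P)(1 - 2P), a choice made for this example rather than
something the lesson prescribes.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities P(X = r) for r = 0..n."""
    return [comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]


def modes(pmf, tol=1e-12):
    """All r whose probability is within tol of the maximum (ties are kept)."""
    top = max(pmf)
    return [r for r, prob in enumerate(pmf) if top - prob <= tol]


def tail_direction(n, p):
    """Side of the longer tail, from the sign of N*P*(1 - P)*(1 - 2P)."""
    third = n * p * (1 - p) * (1 - 2 * p)
    if abs(third) < 1e-12:
        return "symmetric"
    return "right" if third > 0 else "left"


if __name__ == "__main__":
    # Exercise 1: vary P for fixed N = 6.
    for p in (0.2, 0.5, 0.8):
        print(f"N = 6, P = {p}: NP = {6 * p:.1f}, "
              f"mode(s) = {modes(binomial_pmf(6, p))}, "
              f"tail = {tail_direction(6, p)}")
    # Exercise 2: vary N for fixed P = .3.
    for n in (3, 6, 9, 12):
        print(f"N = {n}, P = 0.3: NP = {n * 0.3:.1f}, "
              f"mode(s) = {modes(binomial_pmf(n, 0.3))}")
```

For N = 9 and P = .3 the probabilities at X = 2 and X = 3 are equal,
so the maximum is shared by the two values on either side of NP = 2.7.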



Teletype Output for Session 5 



@XQT STAT*CONCEPT.BDIST
ARE YOU USING A HAZELTINE 2000?
COIN TOSS OR BINOMIAL?

BINOMIAL FORMULA EVALUATION   N = ? 6.
P = ? .2

 R     P(X = R)
 0     .262144
 1     .393216
 2     .245760
 3     .081920
 4     .015360
 5     .001536
 6     .000064

MORE BINOMIAL EVALUATION? YES
BINOMIAL FORMULA EVALUATION   N = ? 6.
P = ? .5

 R     P(X = R)
 0     .015625
 1     .093750
 2     .234375
 3     .312500
 4     .234375
 5     .093750
 6     .015625

MORE BINOMIAL EVALUATION? YES
BINOMIAL FORMULA EVALUATION   N = ? 6.
P = ? .8

 R     P(X = R)
 0     .000064
 1     .001536
 2     .015360
 3     .081920
 4     .245760
 5     .393216
 6     .262144

MORE BINOMIAL EVALUATION? YES
BINOMIAL FORMULA EVALUATION   N = ? 3.
P = ? .3

 R     P(X = R)
 0     .343000
 1     .441000
 2     .189000
 3     .027000

MORE BINOMIAL EVALUATION? YES
BINOMIAL FORMULA EVALUATION   N = ? 6.
P = ? .3

 R     P(X = R)
 0     .117649
 1     .302526
 2     .324135
 3     .185220
 4     .059535
 5     .010206
 6     .000729

MORE BINOMIAL EVALUATION? YES
BINOMIAL FORMULA EVALUATION   N = ? 9.
P = ? .3

 R     P(X = R)
 0     .040354
 1     .155650
 2     .266828
 3     .266828
 4     .171532
 5     .073514
 6     .021004
 7     .003858
 8     .000413
 9     .000020

MORE BINOMIAL EVALUATION? YES
BINOMIAL FORMULA EVALUATION   N = ? 12.
N MUST BE IN RANGE 1.0 TO 9.0. TRY AGAIN. N = ? 5.
P = ? .3

 R     P(X = R)
 0     .168070
 1     .360150
 2     .308700
 3     .132300
 4     .028350
 5     .002430

MORE BINOMIAL EVALUATION? NO
THE NEXT COMMAND YOU ENTER SHOULD BE EITHER
A @FIN OR A @XQT



